Abstract

We propose to estimate time-varying frequency-selective channels using data-dependent
superimposed training (DDST) and a basis expansion model (BEM). The superimposed training
consists of the sum of a known sequence and a data-dependent sequence, which is unknown to
the receiver. The data-dependent sequence cancels the effects of the unknown data on channel
estimation. Symbol detection is performed using MMSE equalization. The method is compared
to time-division-multiplexing-based methods.


0.1  Introduction 


Wireless  and  mobile  communications  channels  for  high  data  rate  transmission  are  typically  time 
and  frequency  selective.  Frequency-selectivity  is  due  to  multipath  propagation  and  large  signal 
bandwidth  whereas  time-selectivity  is  induced  by  Doppler.  Such  doubly  selective  channels  offer 
joint multipath-Doppler diversity gains [2]. However, achieving such gains requires channel
acquisition,  which  is  a  challenging  task.  Further,  when  the  channel  is  fast  fading,  the  common 
approach  of  assuming  the  channel  to  be  quasi-static  over  a  certain  interval  of  time  may  lead 
to  unacceptable  system  performance.  Thus,  accurate  estimation  of  doubly-selective  channels  is 
well  motivated. 

In  most  practical  systems,  training  is  used  to  facilitate  channel  estimation.  Blind  techniques 
typically  require  long  data  records  and  are  often  complex  to  implement.  The  conventional 
way  of  multiplexing  training  symbols  with  the  data  is  time-division  multiplexing  (TDM)  [3,  4]. 
In  the  case  of  purely  time-selective  channels,  periodic  insertion  of  training  symbols,  known  as 
pilot  symbol  aided  modulation  (PSAM),  was  shown  to  be  optimal  in  the  sense  of  minimizing 
the  mean  square  error  (MSE)  of  channel  estimation.  For  purely  frequency-selective  channels, 
periodic  insertion  of  pilot  clusters  was  shown  to  be  optimal  [4].  For  doubly  selective  channels 
and  zero-padded  block  transmission,  using  the  basis  expansion  model  (BEM)  of  [1,  2],  it  was 
shown in [14] that periodic insertion of zero-guarded pilot symbols was optimal. For
cyclic-prefixed systems, orthogonal multiplexing is used in the frequency domain [12]. An alternative
approach to orthogonal multiplexing schemes is superimposed training (ST). This scheme saves
valuable bandwidth at the expense of a reduction in the information signal-to-noise ratio (SNR),
since  some  of  the  transmitted  energy  is  allocated  to  the  embedded  pilots.  In  the  case  of  purely 
time-selective  channels,  it  was  shown  in  [13]  that  ST  outperforms  PSAM  when  the  fading  is  fast. 
ST  schemes  have  also  been  proposed  in  [9,  10].  The  main  drawback  of  such  a  scheme  is  that 
performance  of  a  channel  estimator  is  limited  by  the  unknown  data  which  act  as  a  source  of  input 
noise.  To  circumvent  this,  a  variant  of  the  ST  scheme,  called  data-dependent  ST  (DDST)  was 
proposed  in  [8,  11]  for  purely  frequency-selective  channels.  Unlike  the  conventional  ST  scheme, 
the  training  sequence  in  the  DDST  method  was  set  to  be  the  sum  of  a  known  (to  the  receiver) 
sequence,  and  a  data-dependent  sequence,  which  is  unknown  at  the  receiver.  Here,  we  extend 
this  method  to  include  time  and  frequency-selective  channels.  Towards  this  objective,  we  use 
the  basis  expansion  model  (BEM)  [1,  2]  which  has  been  used  to  approximate  doubly  selective 
channels. 

The report is organized as follows. The next section describes the system model. Channel
estimation is presented in Section 0.3. The issue of optimum training design is addressed for
the proposed pilot assisted transmission in Section 0.4. Equalization and symbol detection are
explained in Section 0.5. Simulation results are presented in Section 0.6 and conclusions are drawn
in Section 0.7.

Notation: Superscripts $H$, $T$ and $\dagger$ denote Hermitian, transpose and pseudo-inverse
operators. The trace and statistical expectation are denoted by $\mathrm{Tr}\{\cdot\}$ and $E\{\cdot\}$. The $n$th element
of a vector $z$ is denoted by $z(n)$. The $(N \times N)$ identity matrix is denoted by $I$. Finally,
$\mathrm{diag}(a_1, \ldots, a_N)$ is the $(N \times N)$ diagonal matrix whose $n$th diagonal entry is $a_n$. A matrix of
zeros will be denoted by $0$. The symbol $\propto$ will mean "proportional to".

0.2  System  Model 

Consider a doubly-selective communication link and let $h(t;\tau)$ denote its time-varying impulse
response which includes the doubly-selective channel as well as the transmit-receive filters. Let
$H(f;\tau)$ denote the Fourier transform of $h(t;\tau)$. Let us define the delay spread $\tau_{\max}$ and the
Doppler spread $f_{\max}$ as the thresholds on $\tau$ and $f$ beyond which $|H(f;\tau)| \approx 0$.

Consider  a  cyclic-prefixed  single-carrier  block  transmission  system  operating  over  such  a 
channel.  In  order  to  avoid  interblock  interference,  we  assume  the  length  of  the  cyclic  prefix 
(CP) to be greater than or equal to the length of the channel. At the receiver, after removing the
CP,  the  baud-sampled  discrete-time  baseband  signal  model  for  each  received  block  (we  omit  the 
block  index  for  notational  simplicity)  is 

\[ y(n) = \sum_{\ell=0}^{L-1} h(n;\ell)\, s(n-\ell) + v(n), \qquad n = 0, \ldots, N-1 \tag{1} \]

where $N$ is the length of the block, $h(n;\ell)$ is the time-varying $\ell$th tap of the channel, $L-1$ is
the order of the channel in number of samples, and $\{s(n)\}$ is the transmitted block. Because
of the CP, $s(-i) = s(N-i)$, $i = 1, \ldots, L-1$. We assume that the transmitted symbols $s(n)$ are
zero-mean and independent of the zero-mean noise $v(n)$.

Now,  we  use  a  BEM  to  model  the  time-varying  channel.  We  focus  here  on  the  exponential 
basis functions. Under the assumption that the delay and Doppler spreads are bounded by $\tau_{\max}$
and $f_{\max}$ respectively, the time-varying channel can be modelled for $n = 0, \ldots, N-1$ as

\[ h(n;\ell) = \sum_{q=-Q/2}^{Q/2} h_{q,\ell}\, e^{j2\pi qn/N}, \qquad \ell = 0, \ldots, L-1 \tag{2} \]

where $L$ and $Q$ satisfy the following conditions:

\[ (L-1)T \ge \tau_{\max}, \qquad Q/(NT) \ge 2 f_{\max} \]

where $T$ is the symbol period. We also assume that $N \gg L(Q+1)$.
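As a minimal sketch of how the two conditions above fix the BEM dimensions, the helper below picks the smallest $L$ and the smallest even $Q$ that satisfy them. The function name `bem_orders` and the numerical values are illustrative assumptions, not part of the original report; the symbol period is normalized to one so that `f_max` plays the role of a normalized Doppler spread.

```python
import math

def bem_orders(tau_max, f_max, symbol_period, block_len):
    """Smallest L and smallest even Q with (L-1)T >= tau_max and Q/(NT) >= 2*f_max."""
    L = math.ceil(tau_max / symbol_period) + 1
    # Q must be even because the BEM index q runs over -Q/2, ..., Q/2.
    Q = 2 * math.ceil(f_max * block_len * symbol_period)
    return L, Q

# Illustrative numbers (assumed): T = 1, so f_max is normalized to the symbol rate.
L, Q = bem_orders(tau_max=2.0, f_max=0.003, symbol_period=1.0, block_len=256)
print(L, Q)
```

The number of unknown BEM coefficients is then $L(Q+1)$, which should stay well below the block length $N$.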

Using eq. (2), the signal model in eq. (1) can be written in vector form as

\[ y = \sum_{q=-Q/2}^{Q/2} D_q H_q s + v \tag{3} \]

where $D_q := \mathrm{diag}(1, \ldots, e^{j2\pi q(N-1)/N})$, $H_q$ is an $(N \times N)$ circular matrix with first column
$[h_{q,0}, \ldots, h_{q,L-1}, 0, \ldots, 0]^T$, and $s$ is the $(N \times 1)$ transmitted block. Equivalently, $y$ can be
rewritten as

\[ y = \sum_{q=-Q/2}^{Q/2} D_q S h_q + v \tag{4} \]

where $S$ is the leading $(N \times L)$ submatrix of the $(N \times N)$ circular Toeplitz matrix whose first column
is $s$, and $h_q = [h_{q,0}, \ldots, h_{q,L-1}]^T$. In a more compact form, $y$ can be expressed as

\[ y = D[I_{Q+1} \otimes S]h + v \tag{5} \]

where $D = [D_{-Q/2} \cdots D_{Q/2}]$, $h = [h^T_{-Q/2} \cdots h^T_{Q/2}]^T$.
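The equivalence between the sample-wise model (1)-(2) and the vector form (3) can be checked numerically. The sketch below is only an illustration under assumed toy values (the function names, the coefficients in `h_bem` and the block `s` are hypothetical); it builds a noise-free received block both ways and compares them.

```python
import cmath

def received_block(h_bem, s):
    """Noise-free y(n) from eqs. (1)-(2); h_bem[qi][l] holds h_{q,l} with q = qi - Q/2."""
    N, Q, L = len(s), len(h_bem) - 1, len(h_bem[0])
    y = []
    for n in range(N):
        acc = 0j
        for l in range(L):
            # Time-varying l-th tap at time n, eq. (2).
            tap = sum(h_bem[qi][l] * cmath.exp(2j * cmath.pi * (qi - Q // 2) * n / N)
                      for qi in range(Q + 1))
            acc += tap * s[(n - l) % N]  # the cyclic prefix makes the convolution circular
        y.append(acc)
    return y

def received_block_vector_form(h_bem, s):
    """Same block via eq. (3): y = sum_q D_q H_q s."""
    N, Q, L = len(s), len(h_bem) - 1, len(h_bem[0])
    y = [0j] * N
    for qi in range(Q + 1):
        q = qi - Q // 2
        for n in range(N):
            circ = sum(h_bem[qi][l] * s[(n - l) % N] for l in range(L))  # (H_q s)(n)
            y[n] += cmath.exp(2j * cmath.pi * q * n / N) * circ          # n-th entry of D_q
    return y

# Toy example (assumed values): Q = 2, L = 2, N = 12.
h_bem = [[0.3 + 0.1j, -0.2j], [1.0, 0.5 - 0.5j], [0.1j, 0.2]]
s = [complex(n % 3 - 1, (2 * n) % 5 - 2) for n in range(12)]
y_scalar = received_block(h_bem, s)
y_vector = received_block_vector_form(h_bem, s)
max_diff = max(abs(a - b) for a, b in zip(y_scalar, y_vector))
```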

0.3  Channel  Estimation  using  Data-Dependent  Superimposed 
Training 

In a TDM scheme, some of the entries of $s$ are known pilots. In the conventional ST scheme, a
known training sequence, $c$, is added to the data vector, $w$, i.e., $s = w + c$. The data symbols
are assumed to be zero-mean, i.i.d. random variables drawn from a finite alphabet, e.g., PSK
or QAM; let $\sigma_w^2$ denote the data symbol power. The channel coefficients can be consistently
estimated using the first-order statistics of the received signal [5, 6]. A disadvantage of this
method is that the channel estimate is degraded by the embedded unknown data, which acts as
a source of input noise. To mitigate this problem, we use the DDST approach [8], where $w$ is
distorted prior to adding the known training sequence. Let $\underline{w} = w + e$ be the distorted data
vector, where $e$ is a zero-mean data-dependent sequence. With $s = \underline{w} + c$, $y$ can be written as

\[ y = D[I_{Q+1} \otimes C]h + D[I_{Q+1} \otimes \underline{W}]h + v \]

where $C$ and $\underline{W}$ are defined similarly to $S$ in (4). The linear least squares (LLS) channel estimate,
which regards the data-related term on the right-hand side of the above equation as noise, is given by

\[ \hat{h} = (D[I_{Q+1} \otimes C])^{\dagger} y. \tag{6} \]


0.3.1  Identifiability 

The channel estimator in eq. (6) is consistent iff the following identifiability condition is satisfied

\[ \mathrm{rank}\{D[I_{Q+1} \otimes C]\} = L(Q+1). \tag{7} \]

Equivalently,

\[ \mathrm{rank}\{[D_{-Q/2}C, \ldots, D_0 C, \ldots, D_{Q/2}C]\} = L(Q+1). \]

Recall that $C$ is the $N \times L$ leading submatrix of a circulant matrix, and the $D_q$'s by definition
are $N \times N$ full rank matrices. Equivalently, $N \ge L(Q+1)$, $C$ must have full column rank $L$,
and the $Q+1$ sub-matrices must be orthogonal to one another, i.e., $C^H D_m^H D_n C = 0$, $m, n =
-Q/2, \ldots, Q/2$, $m \ne n$. We note that such a condition would be required even if the exponential
bases were replaced by an arbitrary orthonormal basis set. For the exponential basis set, the
necessary and sufficient conditions are

Result 1 Channel identifiability is ensured iff $N \ge L(Q+1)$, the training sequence has at
least $L$ non-zero tones, and $C^H D_q C = 0$, $q = \pm Q, \ldots, \pm 1$.

Let $D_{\tilde{c}} = \mathrm{diag}(\tilde{c})$ with $\tilde{c}$ being the DFT of $c$. Then a sufficient condition is that $D_{\tilde{c}}^H J^q D_{\tilde{c}} = 0$,
$q \ne 0$, where $J$ is the circular shift matrix operator. Thus, the training sequence must satisfy
an interesting shift-orthogonality in the frequency domain.

In the following, we refer to the indices of the nonzero entries of $\tilde{c}$ as pilot frequencies.
Let $\mathcal{P}$ denote the subset of $\{0, \ldots, N-1\}$ containing these pilot frequencies and let $P$ denote
its cardinality. Note that a necessary but not sufficient condition for channel identifiability is
$P \ge L$.

Corollary 1 If $P = L$, then channel identifiability is guaranteed if the pilot frequencies are
spaced at least $(Q+1)$ apart.

We  make  the  following  remarks 

•  Corollary  1  implies  that  L(Q  +  1)  unknown  channel  coefficients  can  be  identified  with 
only  L  pilot  tones.  When  Q  =  0,  this  is  the  standard  result:  the  training  sequence  must 
have  at  least  L  tones  to  cope  with  the  unknown  possibly  annihilating  L  —  1  channel  zeros. 
For  Q  >  0,  this  identifiability  is  possible  thanks  to  the  frequency  diversity  (or  frequency 
spread)  offered  by  the  time-varying  channel  and  enabled  by  the  “shift-orthogonal”  training 
sequence. 

• A channel identifiability condition that is independent of $\mathcal{P}$ is $P \ge (L-1)Q + L$. This is
required when the pilot frequencies are cyclically contiguous.

0.3.2  Data-Independent  Channel  Estimation  Condition 

In order for $\hat{h}$ to be independent of the data, the following condition must be satisfied

\[ [I_{Q+1} \otimes C]^H D^H D [I_{Q+1} \otimes \underline{W}] = 0 \tag{8} \]

which can be equivalently expressed as

\[ C^H D_q \underline{W} = 0, \qquad q = -Q, \ldots, Q. \tag{9} \]
Using the same reasoning as in the previous subsection, condition (9) can be expressed in
the frequency domain as

\[ \sum_{m=0}^{N-1} \tilde{c}^*(m)\, \tilde{\underline{w}}(\langle m+q \rangle_N)\, e^{j2\pi m\ell/N} = 0, \qquad q = -Q, \ldots, Q, \quad \ell = -L+1, \ldots, L-1 \tag{10} \]

where $\tilde{\underline{w}}$ is the DFT of $\underline{w}$, and $\langle \cdot \rangle_N$ denotes arithmetic modulo $N$.

Let $\mathcal{Z}$ be the subset of $\{0, \ldots, N-1\}$ containing the indices of the DFT entries of $\underline{w}$ involved
in Condition (10), and let $Z$ denote its cardinality. Note that $\mathcal{Z}$ depends on $\mathcal{P}$. More specifically,
if $k \in \mathcal{P}$, then $\{\langle k-Q \rangle_N, \ldots, \langle k+Q \rangle_N\} \subset \mathcal{Z}$. Condition (10) imposes $(2L-1)(2Q+1)$
constraints on the $Z$ elements of $\tilde{\underline{w}}$ indexed by $\mathcal{Z}$. Therefore, the number of effective constraints on
$\tilde{\underline{w}}$ (or $\underline{w}$) is $\min(Z, (2L-1)(2Q+1))$. By keeping $P$ to the minimum value for a given pilot
placement scheme, it can be shown that $\min(Z, (2L-1)(2Q+1)) = Z$. In this case, Condition
(10) implies that the DFT entries of $\underline{w}$ indexed by $\mathcal{Z}$ must be set to zero. Hence, in what follows,
$P$ will be kept to the minimum value. We now make the following remarks.

• If the pilot frequencies are cyclically contiguous, the minimal value of $P$ that guarantees
channel identifiability is $P = (L-1)Q + L$, as mentioned in the previous subsection. In
this case, $\mathcal{Z}$ consists of only one cluster of size $Z = (L+1)Q + L$.

• In the case where $L$ pilot frequencies are spaced at least $(Q+1)$ apart, as in Corollary 1,
$\mathcal{Z}$ consists of $L$ disjoint clusters of size $2Q+1$, and $Z = L(2Q+1)$. Since $Z$ in this case
is larger than that obtained in the case of contiguous pilot frequencies, data distortion is
also greater. However, as we will see in the next section, placing the pilot frequencies
contiguously is worse as far as channel estimation performance is concerned.

• Note that $Z > P$ for time-varying channels, unlike the case of time-invariant channels (i.e.,
$Q = 0$) where $Z = P$ regardless of $\mathcal{P}$ [8].

• In the presence of a DC-offset, it is preferable that $\mathcal{P}$ does not include the zero frequency
in order to decouple channel and DC-offset estimation. To make DC-offset estimation
data-independent, the zero frequency should be added to $\mathcal{Z}$.
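The cluster sizes quoted in the remarks above can be reproduced by enumerating the zeroed DFT bins directly. The following is a small sketch under assumed toy dimensions ($N = 64$, $L = 3$, $Q = 2$); the function name `zeroed_bins` and the two pilot placements are illustrative choices.

```python
def zeroed_bins(pilot_bins, Q, N):
    """Set Z: every DFT bin within +/-Q (cyclically) of a pilot frequency."""
    return {(k + q) % N for k in pilot_bins for q in range(-Q, Q + 1)}

N, L, Q = 64, 3, 2

# Cyclically contiguous pilots with the minimal count P = (L-1)Q + L.
contiguous = list(range((L - 1) * Q + L))
# L pilots spread over the band (spacing far larger than Q+1 here).
spaced = [i * (N // L) for i in range(L)]

Z_contiguous = len(zeroed_bins(contiguous, Q, N))
Z_spaced = len(zeroed_bins(spaced, Q, N))
print(Z_contiguous, (L + 1) * Q + L)
print(Z_spaced, L * (2 * Q + 1))
```

Note that in this sketch the clusters around the spaced pilots only stay disjoint when the spacing is at least $2Q+1$; with a smaller spacing the clusters overlap and the count returned by `zeroed_bins` drops below $L(2Q+1)$.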


0.4  Optimum  Training  Sequence  Design 

In the case of purely frequency-selective channels, it was shown in [?] that designing $c$ so that
its DFT has only $L$ non-zero entries which are equally spaced and have the same magnitude
is optimal in terms of minimizing the mean square error (MSE) of the LLS channel estimate
and minimizing data distortion. For doubly-selective channels, the design of $c$ is not as simple
because, as we will see later, minimizing the MSE of $\hat{h}$ under Condition (9) does not minimize
data distortion and vice versa.


0.4.1  Minimizing  the  MSE  of  Channel  Estimate 

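Since $v$ is AWGN, the MSE of $\hat{h}$ is, under Condition (9), given by

\[ \mathrm{mse}(\hat{h}) := \mathrm{Tr}\{E\{(\hat{h} - h)(\hat{h} - h)^H\}\}
   = \sigma_v^2\, \mathrm{Tr}\{([I_{Q+1} \otimes C^H] D^H D [I_{Q+1} \otimes C])^{-1}\}. \]

We have the following inequality

\[ \mathrm{mse}(\hat{h}) \ge \sigma_v^2 [L(Q+1)]^2\, \mathrm{Tr}^{-1}\{[I_{Q+1} \otimes C^H] D^H D [I_{Q+1} \otimes C]\} \]

with equality iff

\[ [I_{Q+1} \otimes C^H] D^H D [I_{Q+1} \otimes C] \propto I \]

which is equivalent to

\[ C^H D_q C \propto \delta(q) I, \qquad q = -Q, \ldots, Q. \]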
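For completeness, the trace inequality used above follows from the Cauchy-Schwarz inequality applied to the eigenvalues of the Gram matrix. The short derivation below is a sketch; the shorthand $A$, $n$ and $\lambda_i$ is introduced here for illustration and does not appear in the original report.

```latex
\begin{align*}
A &:= [I_{Q+1} \otimes C^H]\, D^H D\, [I_{Q+1} \otimes C], \qquad n := L(Q+1),
   \qquad \lambda_1, \ldots, \lambda_n > 0 \ \text{the eigenvalues of } A, \\
n^2 &= \Big( \sum_{i=1}^{n} \sqrt{\lambda_i} \cdot \frac{1}{\sqrt{\lambda_i}} \Big)^2
   \le \Big( \sum_{i=1}^{n} \lambda_i \Big) \Big( \sum_{i=1}^{n} \lambda_i^{-1} \Big)
   = \mathrm{Tr}\{A\}\, \mathrm{Tr}\{A^{-1}\}, \\
\mathrm{Tr}\{A^{-1}\} &\ge \frac{n^2}{\mathrm{Tr}\{A\}},
   \quad \text{with equality iff } \lambda_1 = \cdots = \lambda_n, \ \text{i.e. } A \propto I.
\end{align*}
```

Since $A$ is Hermitian and, under the identifiability condition, positive definite, equal eigenvalues are equivalent to $A$ being a scaled identity, which is the equality condition stated above.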

Using the same reasoning as in the previous section, the above condition becomes

\[ \sum_{m=0}^{N-1} \tilde{c}^*(m)\, \tilde{c}(\langle m+q \rangle_N)\, e^{-j2\pi m\ell/N} = \sigma_c^2\, \delta(q)\delta(\ell), \qquad q = -Q, \ldots, Q, \quad \ell = -L+1, \ldots, L-1. \tag{12} \]

A simple design that satisfies the above condition is

\[ |\tilde{c}(k)|^2 = \frac{\sigma_c^2}{P} \sum_{i=0}^{P-1} \delta(k - iM - t), \qquad k = 0, \ldots, N-1 \]

and

\[ N = PM, \qquad 0 \le t \le M-1 \quad \text{and} \quad M \ge Q+1 \tag{13} \]

where $P$, $t$ and $M$ are integers. This design consists of $P$ equispaced tones, at least
$Q+1$ apart. When $M \le Q$, Condition (12) can still be satisfied but in this case, the phases of
the $\tilde{c}(m)$ would have to be constrained as well. However, the training design in this case is not
interesting since $P$ should be minimized in the DDST approach, as we will see later. Note that
in the presence of a DC offset, $t$ should be chosen nonzero in order to decouple channel and DC
offset estimation [15].
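The shift-orthogonality of the equispaced-tone design can be checked numerically. The sketch below assumes a unitary DFT convention and toy sizes ($N = 24$, $P = L = 3$, $t = 1$, $Q = 2$); the helper names `training_sequence` and `gram`, and the parameter `power` standing for the training power, are hypothetical. It builds $c$ from $P$ equal-magnitude tones at bins $iM + t$ and evaluates $C^H D_q C$ for $q = -Q, \ldots, Q$.

```python
import cmath
import math

def training_sequence(N, P, t, power=1.0):
    """Time-domain c whose unitary DFT has P equal-magnitude tones at bins i*M + t."""
    M = N // P
    c_dft = [0j] * N
    for i in range(P):
        c_dft[i * M + t] = math.sqrt(power / P)
    # Unitary inverse DFT.
    return [sum(c_dft[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / math.sqrt(N)
            for n in range(N)]

def gram(c, L, q):
    """L x L matrix C^H D_q C, where column l of C is c circularly shifted by l."""
    N = len(c)
    return [[sum(c[(n - a) % N].conjugate() * cmath.exp(2j * cmath.pi * q * n / N) * c[(n - b) % N]
                 for n in range(N))
             for b in range(L)] for a in range(L)]

N, P, t, L, Q = 24, 3, 1, 3, 2          # M = 8 >= Q + 1
c = training_sequence(N, P, t, power=1.0)

# Largest deviation of C^H D_q C from power * delta(q) * I over q = -Q..Q.
worst = 0.0
for q in range(-Q, Q + 1):
    G = gram(c, L, q)
    for a in range(L):
        for b in range(L):
            target = 1.0 if (q == 0 and a == b) else 0.0
            worst = max(worst, abs(G[a][b] - target))
```

With these assumed values `worst` stays at the level of floating-point rounding, i.e. the Gram matrix is a scaled identity for $q = 0$ and vanishes for the other shifts.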

Note that, for this design, the identifiability condition in Result 1 is equivalent to $P \ge L$. Using eq.
(13), the minimum MSE is given by

\[ \mathrm{mse}_{\min}(\hat{h}) = \frac{\sigma_v^2\, L(Q+1)}{\sigma_c^2}. \tag{14} \]

It is worth noting that the minimum MSE is not a function of $P$, the number of non-zero entries
of $\tilde{c}$. Thus, $P$ should be set to its minimum value, $L$, in order to minimize data distortion.
Recall that when $L$ pilot frequencies are spaced at least $(Q+1)$ apart, the number of zeroed
entries of $\tilde{\underline{w}}$ is $Z = L(2Q+1)$.

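It is worth pointing out that for the optimal design in eq. (13), the channel estimate in eq.
(6) reduces to

\[ \hat{h}_q = \frac{1}{\sigma_c^2}\, C^H D_q^H y, \qquad q = -Q/2, \ldots, Q/2. \]

The coefficients of $\hat{h}_q$ can also be simply expressed as

\[ \hat{h}_{q,\ell} = \frac{1}{\sigma_c^2} \sum_{k=0}^{P-1} \tilde{c}^*(kM+t)\, u_{q,k}\, e^{j2\pi(kM+t)\ell/N} \tag{15} \]

where

\[ u_{q,k} = \sum_{n=0}^{N-1} y(n)\, e^{-j2\pi n(q+kM+t)/N}. \]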
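To illustrate how the data-dependent distortion removes the data from the channel estimate, the sketch below runs a noise-free toy transmission end to end. It assumes a unitary DFT, QPSK data, and small illustrative dimensions ($N = 32$, $L = P = 2$, $Q = 2$); all names (`dft`, `ddst_estimate`, `power`) and all numerical values are hypothetical. The estimator is the matrix form $\hat{h}_q = \sigma_c^{-2} C^H D_q^H y$, with `power` standing for $\sigma_c^2$.

```python
import cmath
import math
import random

def dft(x, sign=-1):
    """Unitary DFT (sign=-1) or unitary inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N)) / math.sqrt(N)
            for k in range(N)]

def ddst_estimate(y, c, L, Q, power):
    """h_hat[qi][l] = (1/power) * (C^H D_q^H y)(l), with q = qi - Q/2."""
    N = len(y)
    return [[sum(c[(n - l) % N].conjugate() * cmath.exp(-2j * cmath.pi * (qi - Q // 2) * n / N) * y[n]
                 for n in range(N)) / power
             for l in range(L)] for qi in range(Q + 1)]

# Toy setup; every value below is an illustrative assumption.
N, L, Q, P, t, power = 32, 2, 2, 2, 1, 1.0
M = N // P                                   # M = 16 >= Q + 1
pilots = [i * M + t for i in range(P)]
c = dft([math.sqrt(power / P) if k in pilots else 0j for k in range(N)], sign=+1)

rng = random.Random(0)
w = [complex(rng.choice([-1, 1]), rng.choice([-1, 1])) / math.sqrt(2) for _ in range(N)]

# Data-dependent distortion: null the data DFT bins within +/-Q of each pilot.
Z = {(k + q) % N for k in pilots for q in range(-Q, Q + 1)}
w_dft = dft(w)
w_dist = dft([0j if k in Z else w_dft[k] for k in range(N)], sign=+1)
s = [a + b for a, b in zip(w_dist, c)]

# Random BEM coefficients and the noise-free received block of eqs. (1)-(2).
h = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)] for _ in range(Q + 1)]
y = [sum(h[qi][l] * cmath.exp(2j * cmath.pi * (qi - Q // 2) * n / N) * s[(n - l) % N]
         for qi in range(Q + 1) for l in range(L)) for n in range(N)]

h_hat = ddst_estimate(y, c, L, Q, power)
max_err = max(abs(h_hat[qi][l] - h[qi][l]) for qi in range(Q + 1) for l in range(L))
```

In this noise-free setting the estimate matches the true BEM coefficients up to rounding even though the receiver does not know the data; replacing `w_dist` by the undistorted `w` in `s` would leave a data-dependent error in `h_hat`.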


0.4.2  Minimizing  Data  Distortion 

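In order to minimize data distortion, we choose the $\underline{w}$ which minimizes the Euclidean distance
between $\underline{w}$ and $w$ under the constraint that the DFT entries of $\underline{w}$ at the frequencies $\mathcal{Z}$ are
identically zero. Using Parseval's theorem, this is equivalent to minimizing

\[ \sum_{k \notin \mathcal{Z}} |\tilde{\underline{w}}(k) - \tilde{w}(k)|^2 + \sum_{k \in \mathcal{Z}} |\tilde{w}(k)|^2 \]

over $\{\tilde{\underline{w}}(k), k \notin \mathcal{Z}\}$. The minimum is obtained when $\tilde{\underline{w}}(k) = \tilde{w}(k)$ for all $k \notin \mathcal{Z}$. Thus,

\[ \underline{w} = (I - \Phi) w \]

with $\Phi = F^H T_{\mathcal{Z}} F$, where $F$ is the DFT matrix and $T_{\mathcal{Z}}$ is obtained after setting the $k \notin \mathcal{Z}$ diagonal
entries of the $(N \times N)$ identity matrix to zero. The power of data distortion,

\[ E\{\|w - \underline{w}\|^2\} = E\{\|\Phi w\|^2\} \]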

is, under the assumption of i.i.d. data symbols, given by $Z\sigma_w^2$. Thus, data distortion increases
with $Z$ but is not a function of the placements of the zeroed DFT entries of $\underline{w}$. This implies
that minimizing $Z$ also minimizes data distortion. In Subsection 0.3.2, it was shown that $Z$ is
minimum when the pilot frequencies are cyclically contiguous: $Z = (L+1)Q + L$. However, this
pilot placement is not optimal for channel estimation. Recall that for the optimal pilot design in
eq. (13) where $P = L$, we have that $Z = L(2Q+1)$, i.e., $(L-1)Q$ more zeroed DFT entries than
the minimum value obtained with cyclically contiguous pilot frequencies. Note that in the case
of purely time-selective channels (i.e., $L = 1$), the optimum design in eq. (13) with $P = L = 1$
also minimizes $Z$. Indeed, in this case, the optimal $\tilde{c}$ contains only one non-zero element at an
arbitrary frequency $i$, $\mathcal{Z} = \{\langle i-Q \rangle_N, \ldots, \langle i+Q \rangle_N\}$ and $Z = 2Q+1$. In the presence
of a DC-offset, it is preferable to select $i$ such that the above set of frequencies does not include
the zero frequency [15].
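The projection that produces the distorted data, and the Parseval argument behind the distortion power, can be sketched as follows. The code assumes a unitary DFT, unit-power QPSK data and a single pilot at bin 5 with $Q = 2$; the function names and values are illustrative only. For one realization the distortion energy equals the energy of the nulled bins exactly, whereas $Z\sigma_w^2$ is its expected value.

```python
import cmath
import math
import random

def unitary_dft(x, sign=-1):
    """Unitary DFT (sign=-1) or unitary inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N)) / math.sqrt(N)
            for k in range(N)]

def distort(w, zero_bins):
    """Distorted data (I - Phi) w: null the DFT bins in zero_bins, keep the others."""
    w_dft = unitary_dft(w)
    kept = [0j if k in zero_bins else w_dft[k] for k in range(len(w))]
    return unitary_dft(kept, sign=+1)

rng = random.Random(1)
N = 64
w = [complex(rng.choice([-1, 1]), rng.choice([-1, 1])) / math.sqrt(2) for _ in range(N)]
zero_bins = {3, 4, 5, 6, 7}          # assumed: one pilot at bin 5, Q = 2, so Z = 5

w_dist = distort(w, zero_bins)
w_dft = unitary_dft(w)
distortion = sum(abs(a - b) ** 2 for a, b in zip(w, w_dist))     # ||w - w_dist||^2
removed_energy = sum(abs(w_dft[k]) ** 2 for k in zero_bins)      # energy of the nulled bins
residual = max(abs(v) for k, v in enumerate(unitary_dft(w_dist)) if k in zero_bins)
```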


0.5  Linear  Equalization  and  Data  Detection 


Since the $H_q$'s are circular matrices, they can be diagonalised using the DFT matrix, i.e. $H_q =
F^H \tilde{H}_q F$ where $\tilde{H}_q = \mathrm{diag}(H_q(n), n = 0, \ldots, N-1)$ with $H_q(n) = \sum_{\ell=0}^{L-1} h_{q,\ell} \exp(-j2\pi\ell n/N)$.
Therefore, left-multiplying $y$ in eq. (3) by $F$ and using the matrix manipulations in Subsection
0.3.1, we obtain

\[ F y = \sum_{q=-Q/2}^{Q/2} J^q \tilde{H}_q F s + F v =: \mathcal{H} F s + F v. \tag{16} \]

Thus, the MMSE equalizer of $s$ is given by

\[ \hat{s} = F^H G F y \tag{17} \]

where $G = \mathcal{H}^H (\mathcal{H}\mathcal{H}^H + \sigma_v^2 I)^{-1}$. The soft decision of $\underline{w}$ is then given by

\[ \hat{\underline{w}} = \hat{s} - c. \]


The  above  block  MMSE  equalizer  can  be  replaced  by  the  low-complexity  approximation  in 
[16].  Further,  iterative  methods  such  as  those  proposed  in  [17]  can  also  be  implemented.  Such 
methods  were  shown  to  outperform  MMSE  equalization  because  they  better  take  advantage  of 
the  frequency  and  time  diversity  of  the  time-varying  channel. 

Due to data distortion at the transmitter, $\hat{\underline{w}}$ is different from $w$ even in the absence of
channel estimation error and noise. Indeed, in this ideal scenario, $\hat{\underline{w}} = (I - \Phi)w$. Since $(I - \Phi)$
is singular, $w$ cannot be recovered linearly. However, using the fact that the data symbols are
drawn from a finite alphabet and that $\Phi w$ is small compared to $w$, symbol detection can be
undertaken by finding the vector of constellation points $\hat{w}$ that minimizes the Euclidean distance
between $\hat{\underline{w}}$ and $(I - \Phi)\hat{w}$. This sequence detection scheme is computationally cumbersome.
Further, if sequence detection were to be used, then maximum likelihood detection (such as
sphere decoding) should be preferred to linear equalization. Here, we propose the following
iterative symbol-by-symbol detection scheme.

The symbol-by-symbol detection algorithm is initialized by treating $\Phi w$ as an extra additive
noise and considering $\hat{\underline{w}}$ as a soft detector of $w$. The initial hard detector of $w$ is given by

\[ \hat{w}^{(0)} = \lfloor \hat{\underline{w}} \rfloor \]

where $\lfloor u \rfloor$ denotes the vector of constellation points that are the closest to the vector $u$. The
detected symbols are used to estimate $\Phi w$ to be used in the next iteration. The detected
symbols at the $i$th iteration are given by

\[ \hat{w}^{(i)} = \lfloor \hat{\underline{w}} + \Phi \hat{w}^{(i-1)} \rfloor. \]

As we will see in the simulation section, the main gain in symbol detection performance over
existing ST-based methods is obtained with $\hat{w}^{(0)}$.
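The iteration above can be sketched in a few lines. The code below is an illustration only: it assumes unit-power QPSK, a unitary DFT, and the ideal scenario in which the soft decision equals $(I - \Phi)w$; the function names, the deterministic test pattern for `w` and the choice of nulled bins are all hypothetical.

```python
import cmath
import math

def unitary_dft(x, sign=-1):
    """Unitary DFT (sign=-1) or unitary inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N)) / math.sqrt(N)
            for k in range(N)]

def project(x, zero_bins):
    """Phi x: keep only the DFT bins that were nulled at the transmitter."""
    X = unitary_dft(x)
    return unitary_dft([X[k] if k in zero_bins else 0j for k in range(len(x))], sign=+1)

def slice_qpsk(u):
    """Nearest unit-power QPSK point for each entry of u (hard decision)."""
    a = 1 / math.sqrt(2)
    return [complex(a if z.real >= 0 else -a, a if z.imag >= 0 else -a) for z in u]

def iterative_detect(w_soft, zero_bins, iterations=3):
    """w_hat^(0) = slice(w_soft); w_hat^(i) = slice(w_soft + Phi w_hat^(i-1))."""
    w_hat = slice_qpsk(w_soft)
    for _ in range(iterations):
        fill = project(w_hat, zero_bins)          # estimate of the removed component Phi w
        w_hat = slice_qpsk([a + b for a, b in zip(w_soft, fill)])
    return w_hat

# Ideal scenario (no noise, perfect channel knowledge); values are assumed.
N = 32
zero_bins = {3, 4, 5, 6, 7}
w = slice_qpsk([complex(math.cos(1.7 * n * n), math.sin(2.3 * n + 0.5 * n * n)) for n in range(N)])
phi_w = project(w, zero_bins)
w_soft = [a - b for a, b in zip(w, phi_w)]        # (I - Phi) w
w_hat = iterative_detect(w_soft, zero_bins)
errors = sum(a != b for a, b in zip(w_hat, w))
```

A useful sanity check on the update rule is that the true data vector is a fixed point: if the previous iterate equals $w$, the restored term $\Phi w$ exactly cancels the distortion and the slicer returns $w$ again.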

0.6  Simulation  Results 

We compare the proposed DDST scheme with the TDM scheme proposed in [14] in terms of
channel estimation performance and bit error rate (BER). The length of the data block is set
to $N = 256$. The time-varying channel is assumed to be of order $L = 3$ and generated using
Jakes' model with a normalized Doppler frequency $f_D$. Two values of $f_D$ are considered here:
$f_D = 0.003$ and $f_D = 0.005$. The channel coefficients are assumed uncorrelated and their powers
are given by the exponential delay profile $E\{|h(n;\ell)|^2\} = \exp(-0.2\ell)$, $\forall n$. The exponential basis
function model for the channel is used at the receiver for channel estimation with $Q = 2\lceil f_D N \rceil$.
For the values of $f_D$ mentioned above, we have that $Q = 2$ and $Q = 4$, respectively. For both schemes,
we use MMSE equalization. The training sequence for the DDST method is the shifted chirp
sequence given in Section 0.4.1 and its power is set to 10% of the total transmit power. For the
TDM method, zero-guarded pilots are uniformly placed within the block as in [14].

The merits of the two methods are assessed using 500 Monte-Carlo runs. Figures 1 and 3
show the normalized MSE of channel estimation, which is defined as

\[ \frac{\sum_{n=0}^{N-1} \sum_{\ell=0}^{L-1} |\hat{h}(n;\ell) - h(n;\ell)|^2}{\sum_{n=0}^{N-1} \sum_{\ell=0}^{L-1} |h(n;\ell)|^2} \]

for different values of the data rate loss of the TDM method. Note that $\hat{h}(n;\ell)$ is obtained using
the BEM and the $\hat{h}_{q,\ell}$'s. Figures 2 and 4 show the BER performance. The MSE and BER level
off at high SNR because of the channel modelling mismatch due to the BEM approximation. It is
seen that the proposed method outperforms the TDM method in terms of channel estimation.
It also compares favorably with the TDM method in terms of the BER. Recall that the proposed
method does not incur any data rate loss apart from the periodic cyclic prefix insertion.
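As an illustration of the metric, the sketch below rebuilds the time-varying taps from a set of BEM coefficients through eq. (2) and evaluates the normalized MSE for a single realization. The function names and coefficient values are hypothetical, and the Monte-Carlo averaging used for the figures is not reproduced here.

```python
import cmath

def bem_taps(h_bem, N):
    """h(n; l), n = 0..N-1, rebuilt from BEM coefficients h_bem[qi][l] via eq. (2)."""
    Q, L = len(h_bem) - 1, len(h_bem[0])
    return [[sum(h_bem[qi][l] * cmath.exp(2j * cmath.pi * (qi - Q // 2) * n / N)
                 for qi in range(Q + 1))
             for l in range(L)] for n in range(N)]

def normalized_mse(h_est, h_true):
    """Sum over n and l of |h_est - h_true|^2, divided by the sum of |h_true|^2."""
    num = sum(abs(a - b) ** 2 for row_e, row_t in zip(h_est, h_true) for a, b in zip(row_e, row_t))
    den = sum(abs(b) ** 2 for row_t in h_true for b in row_t)
    return num / den

# Assumed toy coefficients (Q = 2, L = 2) and a slightly perturbed estimate.
h_bem = [[0.1j, 0.2], [1.0, 0.5], [0.3, -0.1j]]
h_bem_est = [[x + 0.01 for x in row] for row in h_bem]
N = 16
h_true = bem_taps(h_bem, N)
h_est = bem_taps(h_bem_est, N)
nmse = normalized_mse(h_est, h_true)
```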

Simulation  results  also  show  that  unlike  the  case  of  time-invariant  channels,  the  iterative 
scheme  in  the  previous  section  does  not  seem  to  provide  any  significant  improvement.  This  is 
due  to  the  fact  that  the  BER  at  high  SNR  is  dominated  by  the  channel  modelling  mismatch. 

0.7  Conclusions


We  extended  the  data-dependent  superimposed  training  scheme  in  [8]  to  time-varying  channels. 
We  have  derived  conditions  for  channel  identifiability  and  zero-interference  between  pilots  and 
data.  The  latter  was  achieved  without  trading  off  data-rate.  The  only  penalties  were  a  slight 
decrease  in  data-to-noise  power  ratio  and  a  slight  data  distortion  which  was  mitigated  using 
iterative  symbol  decoding. 

Figure 1: Empirical mean square error of channel estimates (plot title: normalised MSE of channel estimates).

Figure 2: Bit error rate (plot title: $N = 256$, $L = 3$, $f_D = 0.003$, $Q = 2$, $\sigma_c^2 = 10\%$).

Figure 3: Empirical mean square error of channel estimates.

Figure 4: Bit error rate.